Aligning linguistically motivated phrases

Authors

  • Lieve Macken
  • Walter Daelemans
Abstract

In this paper, we describe the architecture of a sub-sentential alignment system that links linguistically motivated phrases in parallel texts. We conceive our sub-sentential aligner as a cascade model consisting of two phases. In the first phase, anchor chunks are linked on the basis of lexical correspondences and syntactic similarity. In the second phase, we will focus on the more complex translational correspondences based on observed translation shift patterns. The anchor chunks of the first phase will be used to limit the search space in the second phase. We present the first results of our sub-sentential alignment system, which links linguistically motivated chunks. In our baseline system, the obtained recall scores range from 44% to 59% and precision scores from 90% to 98% depending on text type. We experimented with two different types of bilingual dictionaries to generate the lexical correspondences: a handcrafted bilingual dictionary and probabilistic bilingual dictionaries. We demonstrate that although the handcrafted dictionary is twice the size of the probabilistic dictionary, the obtained recall scores are lower.

Proceedings of the 18th Meeting of Computational Linguistics in the Netherlands, pp. 37–52. Edited by: Suzan Verberne, Hans van Halteren, Peter-Arno Coppen. Copyright ©2008 by the individual authors.
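The first phase described in the abstract (linking anchor chunks via lexical correspondences and syntactic similarity, then scoring the links by precision and recall) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Chunk` type, the function names, the equal weighting of the two scores, the `threshold` value, the greedy one-to-one linking strategy, and the toy English-Dutch data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk representation: word forms plus coarse POS tags."""
    tokens: tuple[str, ...]
    pos: tuple[str, ...]


def lexical_score(src: Chunk, tgt: Chunk, dictionary: dict[str, set[str]]) -> float:
    """Fraction of source tokens with a dictionary translation in the target chunk."""
    if not src.tokens:
        return 0.0
    tgt_tokens = set(tgt.tokens)
    hits = sum(1 for tok in src.tokens if dictionary.get(tok, set()) & tgt_tokens)
    return hits / len(src.tokens)


def syntactic_score(src: Chunk, tgt: Chunk) -> float:
    """Jaccard overlap of the POS tag sets (an assumed similarity measure)."""
    a, b = set(src.pos), set(tgt.pos)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def link_anchor_chunks(src_chunks, tgt_chunks, dictionary, threshold=0.5):
    """Greedy one-to-one linking of chunk pairs, best combined score first."""
    candidates = []
    for i, s in enumerate(src_chunks):
        for j, t in enumerate(tgt_chunks):
            lex = lexical_score(s, t, dictionary)
            if lex == 0.0:
                continue  # require at least one lexical correspondence
            score = 0.5 * lex + 0.5 * syntactic_score(s, t)  # assumed weights
            if score >= threshold:
                candidates.append((score, i, j))
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    links, used_src, used_tgt = set(), set(), set()
    for _, i, j in candidates:
        if i not in used_src and j not in used_tgt:
            links.add((i, j))
            used_src.add(i)
            used_tgt.add(j)
    return links


def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Precision and recall of predicted links against gold-standard links."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented for illustration only).
src = [
    Chunk(("the", "red", "car"), ("DET", "ADJ", "NOUN")),
    Chunk(("drives",), ("VERB",)),
    Chunk(("very", "fast"), ("ADV", "ADV")),
]
tgt = [
    Chunk(("de", "rode", "auto"), ("DET", "ADJ", "NOUN")),
    Chunk(("rijdt",), ("VERB",)),
    Chunk(("heel", "snel"), ("ADV", "ADV")),
]
# The dictionary deliberately lacks entries for "very" and "fast".
toy_dictionary = {
    "the": {"de", "het"},
    "red": {"rode", "rood"},
    "car": {"auto", "wagen"},
    "drives": {"rijdt"},
}
gold_links = {(0, 0), (1, 1), (2, 2)}

links = link_anchor_chunks(src, tgt, toy_dictionary)
precision, recall = precision_recall(links, gold_links)
print(sorted(links), precision, recall)
```

In this toy setup the third chunk pair is never linked because the dictionary has no entry for its words, so every proposed link is correct while one gold link is missed. This mirrors, in miniature, the high-precision, lower-recall profile reported for the baseline system, and it shows why dictionary coverage rather than dictionary size can drive recall.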

Related resources

Linguistically-Based Sub-Sentential Alignment for Terminology Extraction from a Bilingual Automotive Corpus

We present a sub-sentential alignment system that links linguistically motivated phrases in parallel texts based on lexical correspondences and syntactic similarity. We compare the performance of our subsentential alignment system with different symmetrization heuristics that combine the GIZA++ alignments of both translation directions. We demonstrate that the aligned linguistically motivated p...

Normalization and Similarity Recognition of Complex Predicate Phrases Based on Linguistically-Motivated Evidence

Towards a linguistically motivated computational grammar for Hebrew

While the morphology of Modern Hebrew is well accounted for computationally, there are few computational grammars describing the syntax of the language. Existing grammars are scarcely based on solid linguistic grounds: they do not conform to any particular linguistic theory and do not provide a linguistically plausible analysis for the data they cover. This paper presents a first attempt toward...

General estimation and evaluation of compositional distributional semantic models

In recent years, there has been widespread interest in compositional distributional semantic models (cDSMs), that derive meaning representations for phrases from their parts. We present an evaluation of alternative cDSMs under truly comparable conditions. In particular, we extend the idea of Baroni and Zamparelli (2010) and Guevara (2010) to use corpus-extracted examples of the target phrases f...

Linguistically Motivated Parallel Parsebanks

Parallel grammars and parallel treebanks can be a useful method for studying linguistic diversity and commonality. We use this approach to study how arguments to similar predicates are realized across languages. To that end, we formulate formal principles for aligning at phrase and word levels based on translational correspondences at predicate-argument level. A first version of a new tool for ...

Publication date: 2008